Saturday, June 06, 2009

STL Visualization

As of today, KDevelop can nicely display std::vector. I'll probably omit the obvious snapshot, and will point to a mailing list post with instructions for trying it. Instead, I'll tell the story of this feature.

For its entire history, GDB had no official way to display types from the C++ Standard Library sensibly. Several third-party scripts appeared, written in GDB's internal scripting language. However, they were fairly limited. You had to run those scripts explicitly, and all you got was unstructured text output, which made robust IDE integration impossible. Also, GDB's scripting language is itself unpleasant, and does not even have access to internal data structures and functions. It was clear that we needed a way to write pretty-printers in a real scripting language, with full access to GDB data structures and proper integration with the frontend interface.

The first prototype of Python-based pretty-printing was written by me during a free hack slot at a CodeSourcery company meeting. It took maybe 4 hours, if not less, and could automatically display std::string as a string. Some 4 hours more led to the first public prototype. This version could automatically display std::vector as "[1,2]". The second prototype could finally display the elements of std::vector as children, as one would expect in the variables tree of a frontend, and even report when new elements are added to the vector. However, this version took a couple of days of work, exposed a mere 4 functions from GDB to Python, and was a mess internally. It was clearly already outside the "quick hack" range.

Those prototypes would never have turned into anything were it not for Tom Tromey and Thiago Bauermann, who started a project to add complete Python scripting to GDB. This is much more ambitious than just pretty-printing. In particular, it includes defining new commands in Python, with full access to GDB internals. You can read more details in a post series by Tom.

Pretty-printing became a part of that larger effort, and was greatly improved. One of the most notable changes was incremental fetch of children. According to the C++ standard, an object does not exist until its constructor has exited. However, gcc debug info just lists all local variables in a block, so a variable is visible to the debugger before it has been constructed. A naive pretty-printer, when invoked on such a variable, would likely wander into an uncharted part of memory trying to fetch all children, and never return. To fix this, the Python pretty-printers were designed to use incremental fetch, using Python iterators, and the GDB MI interface was also adjusted to be more incremental (yes, it's a trend). Beyond that, we spent at least 3 weeks iterating on finer details. The GDB patch was finally checked in on Sep 15, and the KDevelop4 patch shortly after.
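To make the incremental idea concrete, here is a minimal sketch in plain Python. It does not import the real gdb module: FakeVector, its read() method and fetch_children() are hypothetical stand-ins invented for illustration. Only the shape of the printer class (to_string, children, display_hint) follows GDB's pretty-printer protocol.

```python
from itertools import islice


class FakeVector:
    """Hypothetical stand-in for a debuggee std::vector."""

    def __init__(self, claimed_size, memory):
        self.claimed_size = claimed_size  # may be garbage before construction
        self.memory = memory
        self.reads = 0                    # counts simulated memory reads

    def read(self, index):
        self.reads += 1
        return self.memory[index % len(self.memory)]


class VectorPrinter:
    """Shaped like a GDB Python pretty-printer."""

    def __init__(self, value):
        self.value = value

    def to_string(self):
        return "std::vector of length %d" % self.value.claimed_size

    def children(self):
        # A generator: an element is read only when the consumer asks for it.
        for i in range(self.value.claimed_size):
            yield "[%d]" % i, self.value.read(i)

    def display_hint(self):
        return "array"


def fetch_children(printer, limit):
    """Take at most `limit` children, as a frontend asking for a range would."""
    return list(islice(printer.children(), limit))


# An unconstructed vector may claim an absurd size; only 3 elements are read.
garbage = FakeVector(claimed_size=10**12, memory=[7, 8, 9])
first = fetch_children(VectorPrinter(garbage), 3)
```

Because children() is a generator, the absurd claimed size costs nothing until somebody actually asks for that many elements.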

This is still an early implementation, and might have bugs, but now it's out for everybody to try.

Monday, June 01, 2009

Linking 101

Recently, I've seen more and more people having trouble with link-time errors, as if such an error were the worst kind of luck and could not be fixed by mere mortals. There are many possible reasons, including Java as the default language in universities and the alarming spread of header-only-philia, but that's for another post. Here, I want to give a simple diagnostic procedure for link-time errors.

Let's lay some groundwork first. If your job is programming in C++, you need to know what the -I and -L options do, and how they are different. Also, given a full path to a library file (with a .a, .so or .lib extension), you should be able to link to that file, in two different ways. If you don't already know all of the above, all hope is lost, and you might want to consider other occupations. Otherwise, let's look at the diagnosis steps for the most common error, 'undefined symbol'.

First, understand where the missing symbol is supposedly defined. An educated guess is usually fine. For example, a symbol named boost::system::foobar is most likely contained in the Boost.System library (and it's surprising how many folks fail to guess so). Then, find out how you are supposed to link to that logical component, using the documentation for the component or the corresponding Linux package. For example, you might decide to add -lboost_system to the linker command line.

Second, make sure that the physical library file being used is the right one, and that the linker is not picking a different version of the library from a directory you don't expect. If you get an error while linking your application, you can use the -t flag of the GNU linker (or -Wl,-t on the gcc command line). This will print full paths for every library used, including those specified with the -lfoo syntax. For static linking, this will also tell you which object files from the static libraries were used. If you get an error when running the application, you can use the LD_DEBUG environment variable. If you set that variable to help before running your program, you'll get a list of possible values. The handiest value in our case is files.

Third, if you seem to link to the right library, there are three further possibilities. First, maybe the library actually should not include the symbol. This can happen if you use the wrong headers during compilation, and can be debugged by passing the -save-temps option to gcc and checking the generated .ii file. Second, the symbol might be almost there but slightly different: using a different calling convention (on Windows), a different wchar_t mode (also on Windows), somewhat different parameter types, or a different namespace. In that case, you'll have to make sure the compilation options of the application match the library's requirements. Finally, it could be that the library actually lacks the symbol due to a library bug, and you have to complain to the maintainer. To distinguish those cases, you need to manually examine the list of library symbols. With gcc, the 'nm' command will do for static libraries, while 'readelf' can be used on shared libraries (Unix only). I don't know the best way on Windows, suggestions welcome.

That's it for the common case. Below, I list some relatively common specific problems. The list does not claim to be complete, so if you know some other cases, drop me a line.

Static linking. For static linking, the order of libraries on the command line matters, so if you don't see the linker grabbing the object file with your symbol, you might want to either reorder the libraries or use the --start-group option. See the ld documentation for details, and note that the performance cost of the --start-group option might not be a concern these days.
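Why the order matters is easier to see in a toy model. The sketch below is a deliberately simplified simulation, not the real ld algorithm: the link() function, its input format and the symbol names are all made up for illustration, and group=True crudely models wrapping every library in --start-group ... --end-group.

```python
def link(inputs, group=False):
    """Return the symbols still undefined after a left-to-right link.

    Each input is (kind, members): kind is "obj" or "lib", and members is a
    list of (defined_symbols, needed_symbols) pairs, one per object file.
    """
    defined, undefined = set(), set()
    used = set()  # (input position, member index) pairs already linked in

    def add(defs, needs):
        defined.update(defs)
        undefined.update(needs)
        undefined.difference_update(defined)

    def scan_archive(pos, members):
        """Pull in members that define a currently undefined symbol."""
        pulled = False
        progress = True
        while progress:
            progress = False
            for m, (defs, needs) in enumerate(members):
                if (pos, m) not in used and undefined & set(defs):
                    used.add((pos, m))
                    add(defs, needs)
                    progress = pulled = True
        return pulled

    # Single pass: objects are always linked, archives are scanned once.
    for pos, (kind, members) in enumerate(inputs):
        if kind == "obj":
            for defs, needs in members:
                add(defs, needs)
        else:
            scan_archive(pos, members)

    # Group mode: rescan all archives until nothing new is pulled in.
    while group:
        results = [scan_archive(pos, members)
                   for pos, (kind, members) in enumerate(inputs)
                   if kind == "lib"]
        if not any(results):
            break
    return undefined


main_o = ("obj", [({"main"}, {"foo"})])   # main() calls foo()
libfoo = ("lib", [({"foo"}, {"bar"})])    # foo() calls bar()
libbar = ("lib", [({"bar"}, set())])

good_order = link([main_o, libfoo, libbar])
bad_order = link([main_o, libbar, libfoo])
grouped = link([main_o, libbar, libfoo], group=True)
```

In the bad order, libbar is scanned before anything needs bar, so its object file is skipped and bar stays undefined; the grouped link rescans and picks it up.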

References to vtable. The GNU C++ compiler sometimes reports an unresolved reference to 'vtable for SomeClass'. This is generally a poor way of saying that the first non-inline virtual method of SomeClass is not defined. See the GCC FAQ.

Windows DLLs. On Windows, if an application wants to use a function in a DLL, then both the DLL and the application should record this intention, using the __declspec(dllexport) and __declspec(dllimport) pair. If either party does not do so, the linker complains. With mingw, a typical error is undefined reference to `_imp___WHATEVER'. It means that the library is static, whereas the application wants to use a shared library.

Windows import libraries. On Windows, it's not possible to link directly to a DLL. Instead, an import library is created and used, typically by passing the /IMPLIB option to the linker. If the linker does not report any errors, but does not produce an import library either, it's a sure sign that you have not exported any function from the DLL, and have to check the logic that adds __declspec(dllexport).

64-bit compilation. When building 64-bit binaries, you can get an error that says something about "relocation R_X86_64_32" and suggests the -fPIC option. The issue here is that 64-bit shared libraries (and position-independent executables) should include only code compiled with -fPIC, so if you link any static libraries into them, those libraries should also be compiled with -fPIC.

Wednesday, May 20, 2009

KDevelop error display

For quite a while I wanted KDevelop to display compilation errors directly inside the editor, as opposed to a separate window you have to click in. It works now, as shown below. This was implemented by Ivan Ruchkin, a student at Moscow State University, who will be defending a term paper about various KDevelop-related work tomorrow. The patches will be posted to the appropriate mailing lists right after that.

Saturday, April 19, 2008

Variable tooltips

The most voted-for feature request for the KDevelop3 debugger was variable tooltips. I don't think KDevelop3 will ever get them, but KDevelop4 will, as shown below.

Friday, December 21, 2007

Debugger stories: pending breakpoints

KDevelop 3.5 has a subtle bug. Sometimes, when you step over a function call, you don't stop on the next line. Instead, the application is resumed until it hits a breakpoint, or exits. This bug is, in fact, a consequence of how breakpoints in shared libraries are implemented.

Suppose you've just started a debugger, and try to set a breakpoint on a function in a shared library. The library itself might not be loaded yet, in which case GDB cannot find the address of the symbol to set the low-level breakpoint. To handle this case, starting with version 6.1, GDB supports pending breakpoints. Such breakpoints don't correspond to any address in the program; they only keep the specified breakpoint location as a string. Whenever a new shared library is loaded, GDB tries to re-parse the breakpoint location, and if that succeeds, creates an ordinary breakpoint.

Now, this does not work when using the MI interface, for a couple of reasons:

  • When a pending breakpoint is resolved, it is deleted, and a new one is created. And GDB fails to inform the MI frontend about this.
  • It's actually not possible to create a pending breakpoint using MI at all.

Because of these issues (and for some historical reasons) KDevelop 3.5 simulates pending breakpoints. GDB is asked to stop whenever a shared library is loaded, and when that happens, KDevelop tries to reinsert breakpoints. This works pretty well, except for the bug I mentioned at the beginning. Suppose you're stepping over a function call (this uses the "next" command at the GDB level). The function opens some shared library, at which point GDB stops and KDevelop tries to reinsert breakpoints. After that KDevelop would like to continue the "next" operation, but it has already been aborted by GDB. All we can do is continue the program.

But that's no longer the case today. As I wrote earlier, GDB was recently modified so that a breakpoint can correspond to several addresses, such as those of template instantiations. A breakpoint is re-evaluated each time a shared library is loaded or unloaded, and locations are added to the breakpoint and removed as appropriate, but it remains the same breakpoint. The nice side effect is that pending breakpoints are now just breakpoints with zero locations, which are re-evaluated just like other breakpoints, and never change their number.
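The "pending is just zero locations" model can be sketched in a few lines of Python. This is a toy illustration, not GDB code: the Debugger and Breakpoint classes, the library name and the address are all hypothetical.

```python
class Breakpoint:
    """Keeps its number for life; only its list of locations changes."""

    def __init__(self, number, spec):
        self.number = number
        self.spec = spec        # location as typed by the user
        self.locations = []     # resolved addresses; empty means pending

    @property
    def pending(self):
        return not self.locations


class Debugger:
    def __init__(self):
        self.loaded = {}        # library name -> {symbol: address}
        self.breakpoints = []

    def set_breakpoint(self, spec):
        bp = Breakpoint(len(self.breakpoints) + 1, spec)
        self.breakpoints.append(bp)
        self._reevaluate()
        return bp

    def load_library(self, name, symbols):
        self.loaded[name] = symbols
        self._reevaluate()

    def unload_library(self, name):
        del self.loaded[name]
        self._reevaluate()

    def _reevaluate(self):
        # Every breakpoint is re-resolved against whatever is loaded now.
        for bp in self.breakpoints:
            bp.locations = sorted(
                symbols[bp.spec]
                for symbols in self.loaded.values()
                if bp.spec in symbols)


dbg = Debugger()
bp = dbg.set_breakpoint("lib_func")
was_pending = bp.pending                 # library not loaded yet
dbg.load_library("libfoo.so", {"lib_func": 0x7F001000})
resolved = list(bp.locations)
dbg.unload_library("libfoo.so")
pending_again = bp.pending
```

Nothing is ever deleted and re-created, so a frontend holding breakpoint number 1 never has to be told that it turned into something else.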

In addition to that, I wrote patches to add pending breakpoint support to MI, which mainly involved getting rid of two parallel breakpoint-setting code paths, one for MI and one for CLI. Thanks to review by Joel Brobecker and Daniel Jacobowitz, those patches went into GDB CVS earlier this month. KDevelop 3.5 SVN was modified to automatically detect and use this GDB feature. So, if you're willing to build CVS HEAD of gdb and KDevelop from the KDE 3.5 branch, you can finally have breakpoints in shared libraries just working.

This was probably my last KDevelop 3.5 commit. KDevelop 4 is ahead.

Monday, November 26, 2007

Breakpoints in constructors

Presently, no release of GDB properly handles breakpoints in constructors. This summer, I worked on fixing that, and while it took longer than expected, it was eventually done, just in time for the Sourcery G++ Fall release. The patches were also submitted to FSF GDB; they missed the window for 6.7, but will be present in the 6.8 release.

The underlying problem with breakpoints in constructors was that gcc generates two distinct function bodies for a constructor. One is the regular one that constructs the entire object, including all bases. The other constructs everything except for virtual base classes. As it happens, gcc emits both constructors even for classes that have no virtual bases at all. GDB was not prepared for a given function name or source line corresponding to several addresses in the program, so it picked one. And usually it picked the wrong one.

Constructors are the most common case, but not the only one. If you set a breakpoint in a function template, you can have multiple template instantiations that correspond to a source line. An inline function can be inlined in multiple places, and lead to exactly the same problem.

The solution, obviously, is to teach GDB that a breakpoint can correspond to several addresses, and then create multiple-location breakpoints when needed. Now, whenever a user creates a breakpoint that resolves to a source line, GDB traverses the line tables for all modules, and if it finds another address for the same line, that address is added to the breakpoint. For a template or inline function, you can end up with quite a lot of locations, so you can review the list of locations, and disable the unwanted ones.
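The line-table traversal can be pictured roughly like this. It is a sketch only: the table layout, module names and addresses are invented, and real line tables are far more involved. The two function names are the mangled names of the two constructor bodies gcc would emit for a hypothetical class Foo.

```python
def resolve_line(line_tables, filename, line):
    """Collect every address that any module's line table maps to the line."""
    locations = []
    for module, table in line_tables.items():
        for address, (src_file, src_line, function) in sorted(table.items()):
            if src_file == filename and src_line == line:
                locations.append({"module": module, "address": address,
                                  "function": function, "enabled": True})
    return locations


# Hypothetical line tables: address -> (source file, line, function).
line_tables = {
    "app": {
        0x401000: ("foo.cpp", 12, "_ZN3FooC1Ev"),  # complete-object ctor
        0x401040: ("foo.cpp", 12, "_ZN3FooC2Ev"),  # base-object ctor
        0x401080: ("foo.cpp", 20, "main"),
    },
    "libbar.so": {
        0x7F0010: ("foo.cpp", 12, "_ZN3FooC2Ev"),
    },
}

locations = resolve_line(line_tables, "foo.cpp", 12)

# The user reviews the list and disables one unwanted location.
locations[1]["enabled"] = False
active = [loc["address"] for loc in locations if loc["enabled"]]
```

One breakpoint on foo.cpp:12 ends up with three locations here, and disabling a location leaves the other two in place.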

The nicest bit of this is the interaction with shared libraries. Say you've set a breakpoint inside a function template. If you load a new shared library, and it contains an instantiation of that function template, a new location will be added to the breakpoint, transparently. If a library is unloaded, the location will become 'pending' until you load the library back.

The side effect of this work was a serious improvement in the way breakpoints in shared libraries work, but that's a topic for another post.

Thursday, February 01, 2007

Debugger Stories: Stack widget

In KDevelop 3.4, the stack widget was not changed much. I can remember just two changes: one that is apparent and one that is subtle.

The apparent change is that we actually parse gdb output, and show it in a readable way, while in KDevelop 3.3 the stack frame formatting was entirely at the mercy of gdb's "backtrace" command.

The subtle change is at the bottom of the screenshot—that "(click to get more frame)" thing. When a program stops, KDevelop fetches very few frames from gdb. If you click on that last item, then another chunk of frames will be fetched.

This behaviour is needed for two reasons. First, if your program is stuck in infinite recursion, and you try to interrupt it from KDevelop, in KDevelop 3.3 you're out of luck. As soon as the program is interrupted, KDevelop asks gdb for the list of all frames. Since your program is in infinite recursion, the number of frames is very large, and gdb is not a very speedy stack-walker. So, you get to wait 5 minutes for the stack to be shown. With incremental display, in a few clicks you'll see which function went astray.

The second reason is embarrassing. Even without infinite recursion, getting the list of frames from gdb takes a lot of time. Something like half a second for getting 30 frames is not unheard of. Ideally, we'd fix gdb, but since we need incremental fetch anyway, fetching a sufficiently small number of frames initially greatly improves responsiveness.
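A rough sketch of the chunked fetch, in Python for brevity. FakeGdb, StackWidget and the chunk size are hypothetical and do not reflect KDevelop's actual code; a real frontend could request such a range with the GDB/MI -stack-list-frames command, which accepts a low and a high frame number.

```python
class FakeGdb:
    """Stand-in for gdb: walking each frame is what costs time."""

    def __init__(self, depth):
        self.depth = depth
        self.frames_walked = 0

    def list_frames(self, low, high):
        """Return frames low..high inclusive, clipped to the real depth."""
        high = min(high, self.depth - 1)
        self.frames_walked += max(0, high - low + 1)
        return ["#%d recurse()" % i for i in range(low, high + 1)]


class StackWidget:
    CHUNK = 5  # hypothetical chunk size

    def __init__(self, gdb):
        self.gdb = gdb
        self.frames = []
        self.has_more = True  # whether to show the "click for more" item

    def fetch_more(self):
        """Fetch the next chunk, plus one extra frame to see if more exist."""
        low = len(self.frames)
        got = self.gdb.list_frames(low, low + self.CHUNK)
        self.has_more = len(got) > self.CHUNK
        self.frames.extend(got[:self.CHUNK])


# Runaway recursion: a million frames deep, but only a handful are walked.
gdb = FakeGdb(depth=1000000)
widget = StackWidget(gdb)
widget.fetch_more()      # program stopped
widget.fetch_more()      # user clicked the last item once

# A shallow stack fits in the first chunk, so there is nothing more to fetch.
shallow = StackWidget(FakeGdb(depth=3))
shallow.fetch_more()
```

The cost is proportional to what the user has asked to see, not to the depth of the stack.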